Efficient Mining of High Average-Utility Itemsets with Multiple Minimum Thresholds

Authors

  • Chun-Wei Lin
  • Ting Li
  • Philippe Fournier-Viger
  • Tzung-Pei Hong
  • Ja-Hwung Su

Abstract

High average-utility itemset mining (HAUIM) is a key data mining task that aims to discover high average-utility itemsets (HAUIs) in transactional databases by taking itemset length into account. Most existing HAUIM algorithms consider only a single minimum utility threshold for identifying HAUIs. In this paper, we address this issue by introducing the task of mining HAUIs with multiple minimum average-utility thresholds (HAUIM-MMAU), where the user may assign a distinct minimum average-utility threshold to each item or itemset. Two efficient strategies, IEUCP and PBCS, are designed to further reduce the search space of the enumeration tree and thus speed up the discovery of HAUIs when considering multiple minimum average-utility thresholds. Extensive experiments carried out on both real-life and synthetic databases show that the proposed approaches can efficiently discover the complete set of HAUIs when considering multiple minimum average-utility thresholds.
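
To make the task concrete, the following is a minimal brute-force sketch of mining HAUIs under per-item thresholds. It is not the IEUCP or PBCS strategy from the paper; it simply enumerates every itemset. The toy database, the threshold values, and the function names are hypothetical. The abstract does not say how an itemset's threshold is derived from its items' thresholds, so the sketch assumes it is their mean.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps item -> utility
# (for example, purchased quantity times unit profit).
transactions = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3},
]

# Hypothetical per-item minimum average-utility thresholds.
item_threshold = {"a": 8, "b": 4, "c": 7}


def average_utility(itemset, db):
    """Sum the itemset's utility over the transactions that contain
    all of its items, then divide by the itemset length."""
    total = 0
    for t in db:
        if all(item in t for item in itemset):
            total += sum(t[item] for item in itemset)
    return total / len(itemset)


def itemset_threshold(itemset, thresholds):
    """ASSUMPTION: the threshold of an itemset is the mean of its
    items' thresholds. The abstract does not specify this rule."""
    return sum(thresholds[item] for item in itemset) / len(itemset)


def mine_hauis(db, thresholds):
    """Exhaustively test every non-empty itemset (exponential cost;
    no pruning is applied, unlike the strategies named in the paper)."""
    items = sorted(thresholds)
    result = {}
    for k in range(1, len(items) + 1):
        for itemset in combinations(items, k):
            au = average_utility(itemset, db)
            if au >= itemset_threshold(itemset, thresholds):
                result[itemset] = au
    return result


if __name__ == "__main__":
    for itemset, au in mine_hauis(transactions, item_threshold).items():
        print(itemset, au)
```

On this toy data, the itemset `("a", "b")` appears only in the first transaction with utility 7, so its average utility is 3.5, below its assumed threshold of 6, and it is rejected. Dividing by length in this way is what keeps longer itemsets from qualifying merely because they contain more items.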

Related resources

A New Algorithm for High Average-utility Itemset Mining

High utility itemset mining (HUIM) is an emerging field in data mining that has gained growing interest due to its various applications. The goal of this problem is to discover all itemsets whose utility exceeds a minimum threshold. The basic HUIM problem does not consider the length of itemsets in its utility measurement, and utility values tend to become higher for itemsets containing more items...

Discovery of High Utility Itemsets Using Genetic Algorithm with Ranked Mutation

Utility mining studies itemset mining in terms of utilities. It is a utility-based itemset mining approach for finding itemsets that conform to user preferences. Modern research in mining high-utility itemsets (HUIs) from databases faces two major challenges: an exponential search space and a database-dependent minimum utility threshold. The search space is extremely vast when t...

Discovery of High Utility Itemsets Using Genetic Algorithm

Contemporary research in mining high utility itemsets from databases faces two major challenges: an exponential search space and a database-dependent minimum utility threshold. The search space is very large when the number of distinct items and the size of the database are large. Data analysts must specify suitable minimum utility thresholds for their mining tasks though they may have no knowledge ...

Mining High Utility Pattern in One Phase without Candidate Generation using up Growth+ Algorithm

Utility mining was developed to address the limitations of frequent itemset mining by introducing interestingness measures that satisfy both statistical significance and the user's expectations. Existing high utility itemset mining algorithms work in two steps: first, generate a large number of candidate itemsets, and second, identify high utility itemsets from the candidates by an additional scan of t...

An efficient algorithm to mine high average-utility itemsets

With the ever-increasing number of applications of data mining, high-utility itemset mining (HUIM) has become a critical issue in recent decades. In traditional HUIM, the utility of an itemset is defined as the sum of the utilities of its items in the transactions where it appears. An important problem with this definition is that it does not take itemset length into account. Because the utility o...

Publication date: 2016